{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Examining pandas DataFrame contents\n",
    "It's useful to be able to quickly examine the contents of a DataFrame. \n",
    "\n",
    "Let's start by importing the pandas library and creating a DataFrame populated with information about airports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "      <th>City</th>\n",
       "      <th>Country</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Seatte-Tacoma</td>\n",
       "      <td>Seattle</td>\n",
       "      <td>USA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Dulles</td>\n",
       "      <td>Washington</td>\n",
       "      <td>USA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Heathrow</td>\n",
       "      <td>London</td>\n",
       "      <td>United Kingdom</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Schiphol</td>\n",
       "      <td>Amsterdam</td>\n",
       "      <td>Netherlands</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Changi</td>\n",
       "      <td>Singapore</td>\n",
       "      <td>Singapore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Pearson</td>\n",
       "      <td>Toronto</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Narita</td>\n",
       "      <td>Tokyo</td>\n",
       "      <td>Japan</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            Name        City         Country\n",
       "0  Seatte-Tacoma     Seattle             USA\n",
       "1         Dulles  Washington             USA\n",
       "2       Heathrow      London  United Kingdom\n",
       "3       Schiphol   Amsterdam     Netherlands\n",
       "4         Changi   Singapore       Singapore\n",
       "5        Pearson     Toronto          Canada\n",
       "6         Narita       Tokyo           Japan"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "airports = pd.DataFrame([\n",
    "                        ['Seatte-Tacoma', 'Seattle', 'USA'],\n",
    "                        ['Dulles', 'Washington', 'USA'],\n",
    "                        ['Heathrow', 'London', 'United Kingdom'],\n",
    "                        ['Schiphol', 'Amsterdam', 'Netherlands'],\n",
    "                        ['Changi', 'Singapore', 'Singapore'],\n",
    "                        ['Pearson', 'Toronto', 'Canada'],\n",
    "                        ['Narita', 'Tokyo', 'Japan']\n",
    "                        ],\n",
    "                        columns = ['Name', 'City', 'Country']\n",
    "                        )\n",
    "\n",
    "airports "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Returning first *n* rows\n",
    "If you have thousands of rows, you might just want to look at the first few rows\n",
    "\n",
    "* **head**(*n*) returns the top *n* rows "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "      <th>City</th>\n",
       "      <th>Country</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Seatte-Tacoma</td>\n",
       "      <td>Seattle</td>\n",
       "      <td>USA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Dulles</td>\n",
       "      <td>Washington</td>\n",
       "      <td>USA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Heathrow</td>\n",
       "      <td>London</td>\n",
       "      <td>United Kingdom</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            Name        City         Country\n",
       "0  Seatte-Tacoma     Seattle             USA\n",
       "1         Dulles  Washington             USA\n",
       "2       Heathrow      London  United Kingdom"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "airports.head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Returning last *n* rows\n",
    "Looking at the last rows in a DataFrame can be a good way to check that all your data loaded correctly\n",
    "* **tail**(*n*) returns the last *n* rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "      <th>City</th>\n",
       "      <th>Country</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Changi</td>\n",
       "      <td>Singapore</td>\n",
       "      <td>Singapore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Pearson</td>\n",
       "      <td>Toronto</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Narita</td>\n",
       "      <td>Tokyo</td>\n",
       "      <td>Japan</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      Name       City    Country\n",
       "4   Changi  Singapore  Singapore\n",
       "5  Pearson    Toronto     Canada\n",
       "6   Narita      Tokyo      Japan"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "airports.tail(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Checkign number of rows and columns in DataFrame\n",
    "Sometimes you just need to know how much data you have in the DataFrame\n",
    "\n",
    "* **shape** returns the number of rows and columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(7, 3)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "airports.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Getting mroe detailed information about DataFrame contents\n",
    "\n",
    "* **info**() returns more detailed information about the DataFrame\n",
    "\n",
    "Information returned includes:\n",
    "* The number of rows, and the range of index values\n",
    "* The number of columns\n",
    "* For each column: column name, number of non-null values, the datatype\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 7 entries, 0 to 6\n",
      "Data columns (total 3 columns):\n",
      "Name       7 non-null object\n",
      "City       7 non-null object\n",
      "Country    7 non-null object\n",
      "dtypes: object(3)\n",
      "memory usage: 148.0+ bytes\n"
     ]
    }
   ],
   "source": [
    "airports.info()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
